Performance and Monitoring Gigabit Network Disks
Summary
Benchmarking disk reading and writing involves all sorts of complications, more so when network or bus data transfers are used. These include use of the Windows File Cache, data written after Windows has indicated completion, large amounts of extra data written, excessive data returned from the destination, required data not being read from disk, exceptionally slow performance with small files, peculiar data transmission speed patterns, unexpected errors and different versions of Windows not behaving in the same way. Some of these can be identified and explained by using Perfmon Performance Monitor.
Benchmarks and file copying tests were run on and between three 64 bit systems using gigabit LAN connections. The three PCs use Athlon 64 X2/Windows XP (AMD2), Core 2 Duo/Vista (C2D) and Phenom II X4/Windows 7 (AMD4). For many of the tests, Performance Monitor (Perfmon) was used to identify disk, LAN and CPU utilisation. In most cases, CPU speed appears not to be an issue. AVAST anti-virus software was active on all systems.
At the least, the tests show that data can be sent at more than 90 MB/second with large files, but speed is more often limited by the disks.
Caching - The benchmarks used can be run with or without data flowing through the Windows RAM based File Cache, the consideration here being the cache on the destination PC. Measured speeds showed that AMD2 cached the data when told not to do so. When not cached, Windows indicates completion when file writing is finished. With the cached alternative, and in association with default disk Optimise For Performance properties, data can be written to disk after the indicated completion time, and slowly, via Lazy Writing. Copy/Paste also involves caching, where the progress window closes before all data is written. With caching, on a benchmark read after write or a copy with verify, the data might be read from RAM on the destination system. Another complication is that data can be cached in RAM on the source computer so that, on a repeat of copying, the disk is not read. This can be useful in showing whether reading has a significant influence on performance.
Extra Data - Extra data is written to disk for the NTFS Volume Log and SearchIndexer, the amount varying by version of Windows and apparently excessive when large numbers of small files are transferred. With LAN data transfers even more data is written, with Perfmon Resource Overview suggesting that data for some files is written more than once. Using AMD4/Windows 7 and C2D/Vista to copy certain files, more than twice the data sent is returned from the destination PC, but this does not occur when using AMD2/XP. Normally, AVAST reads the data (more than twice) from File Cache on the PC where the file is written, but the extra LAN data is due to a second reading from the source PC.
The types of file affected by AVAST include .PDF, .DOC, .HTM and .XLS but not .TXT and .ZIP. Tests run using AVG anti-virus software did not produce the extra LAN traffic.
Small Files - Copying or writing small files via the LAN is particularly slow due to high overheads. This is influenced by using small packet sizes and relatively vast amounts of extra data. Some test results show that using USB can be twice as fast as Gigabit LAN. In this case, a batched read, compress (ZIP) and write can be four times faster.
Peculiarities - These include Vista and Windows 7 appearing to see main data packets as different sizes, and AMD2/XP sometimes being particularly slow with low CPU utilisation yet missing some Perfmon recording samples, as though it were too busy. Then, with AMD2 sending, there is some output queuing and data flow is alternately fast and slow over adjacent seconds. Finally, with AMD4 sending lots of same sized files to C2D, the benchmark sometimes stops on reading, indicating that a file cannot be found.
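The batched read, compress (ZIP) and write approach mentioned under Small Files can be sketched as follows. This is only an illustration using Python's standard zipfile module, not the procedure used for the tests; the function name batch_to_zip and its arguments are hypothetical. The idea is that one archive is sent over the LAN instead of hundreds of small files, so the per-file overheads are paid once.

```python
import zipfile
from pathlib import Path


def batch_to_zip(source_dir, archive_path):
    """Pack every file under source_dir into one ZIP archive.

    Returns the number of files added. The archive should be written
    outside source_dir, otherwise it could try to include itself.
    """
    source = Path(source_dir)
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the source folder.
                archive.write(path, arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

The archive would then be copied to the destination as a single large file and unpacked there.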
General
Benchmarks and file copying tests were run on the following systems to measure performance using gigabit LAN connections via a Netgear Ethernet Switch. All PCs had 64-Bit versions of Windows installed and AVAST anti-virus software, which was active for most tests.
AMD2 - Athlon 64 X2 Dual Core 4200+ 2.21 GHz, Asus A8N-SLI Deluxe, 1 GB DCDDR RAM, 300 GB Maxtor disk 7200 RPM SATA, NVIDIA nForce Networking Controller, WinXP Pro x64.
C2D - Core 2 Duo 2.4 GHz, Asus P5B, 4 GB DDR2 800 MHz, Seagate ST3400633AS SATA2 Disk 400 GB, Realtek PCI-E Gigabit Ethernet NIC, 64 Bit Vista.
AMD4 - Phenom II X4 Quad Core 945 3.0 GHz, Asus M4A785TD-V, 8 GB DCDDR3 RAM, WD 5400 RPM Green SATA disk, Realtek PCIe GBE Family Controller, 64-Bit Windows 7.
DiskGraf Benchmark
DiskGraf Benchmark measures disk writing and reading speeds and CPU utilisation at different block sizes. Details can be found in DiskGraf Results.htm.
Results are for the 64 bit version that accurately measures CPU utilisation of multi-core processors. Here, 100% indicates that all cores are in use at the same time and, with four cores, 25% is equivalent to 100% of one CPU.
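The relationship between whole-system and single-CPU utilisation can be written as a one-line conversion. The sketch below is purely illustrative and the function name is hypothetical; it is not part of DiskGraf.

```python
def single_cpu_equivalent(total_percent, cores):
    """Convert whole-system % CPU utilisation to % of one CPU."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return total_percent * cores


# With four cores, 25% overall is equivalent to 100% of one CPU.
print(single_cpu_equivalent(25, 4))
```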
This benchmark uses files on the PC that holds the EXE code, so the remote system is the one executing this code, and the CPU utilisation shown is that recorded on the executing system.
The results report includes speeds via 100 Mbps LAN connection, where maximum speeds are no greater than 11 MBytes/second. One example is shown below. The benchmark was run using the default file size of 8 MB, where overheads can reduce maximum throughput. To show the latter, locally run DiskGraf speeds are shown (run at the same time). Overheads also result in higher CPU utilisation and slower speeds using small block sizes.
In most cases, data transfer rate using 1000 Mbps Ethernet could be expected to be limited by disk speed, but overheads would reduce this. The first two results, involving AMD4 and C2D, look reasonable, where maximum transfer rates are more than 85% of that measured for the disk, except for the AMD4 disk which is faster than the LAN. In this and other cases, C2D CPU utilisation is high.
Writing speeds, and some for reading, are slower than might be expected where AMD2 is involved. In three cases, reading speed via LAN looks to be too high. The benchmark program uses FILE_FLAG_NO_BUFFERING in the CreateFile function to tell Windows not to keep the data in main memory based File Cache but, as shown below under Performance Monitoring, the program can run without reading data from disk.
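A block size sweep of the kind DiskGraf performs can be sketched in a few lines. This is a rough illustration only, not the benchmark's code, and the function name and defaults are assumptions. It uses ordinary Python file I/O, where buffering=0 only disables Python's own buffer, so reads will normally be served from the operating system's file cache. That is the same effect described above, where fast reading speeds do not reflect disk speed.

```python
import os
import tempfile
import time


def sweep_block_sizes(file_bytes=8 * 1024 * 1024,
                      block_kbs=(1, 4, 16, 64, 256, 1024),
                      directory=None):
    """Write then read a temporary file at each block size.

    Returns {block KB: (write MB/s, read MB/s)} using decimal megabytes.
    Pass a directory on a network share to time a remote disk.
    """
    results = {}
    for kb in block_kbs:
        block = b"\0" * (kb * 1024)
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            start = time.perf_counter()
            with os.fdopen(fd, "wb", buffering=0) as f:
                for _ in range(file_bytes // len(block)):
                    f.write(block)
                os.fsync(f.fileno())  # ask the OS to flush before timing stops
            write_secs = max(time.perf_counter() - start, 1e-9)

            start = time.perf_counter()
            with open(path, "rb", buffering=0) as f:
                while f.read(len(block)):
                    pass
            read_secs = max(time.perf_counter() - start, 1e-9)
        finally:
            os.remove(path)
        mb = file_bytes / 1e6
        results[kb] = (mb / write_secs, mb / read_secs)
    return results


if __name__ == "__main__":
    for kb, (write, read) in sweep_block_sizes().items():
        print(f"{kb:5d} KB  write {write:8.1f} MB/s  read {read:8.1f} MB/s")
```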
100 Mbps LAN

Block KB       1    2    4    8   16   32   64  128  256  512 1024

Run on AMD4, Files on C2D
Write MB/s     1    2    3    3    6    7    9   10   10   11   11
% CPU Ut       7    7    7    7    5    5    4    6    6    5    5   x4
Read MB/s      1    2    3    3    6    7    8    9   10   11   11
% CPU Ut       7    4    5    5    5    5    5    6    4    5    4   x4

1000 Mbps LAN
                                                                     Disk
                                                                      Max
Block KB       1    2    4    8   16   32   64  128  256  512 1024   MB/s

Run on AMD4, Files on C2D
Write MB/s     3    5    9   17   22   29   32   39   36   35   35     41
% CPU Ut      10   11   11    8    7    7    7    6    3    4    6     x4
Read MB/s      3    5   10   19   26   35   49   52   48   42   46     53
% CPU Ut       9    7   10   10    7    9    9    9    8    9    6     x4

Run on C2D, Files on AMD4
Write MB/s     2    4    7   12   18   28   35   42   49   60   60     68  Fast
% CPU Ut      19   24   21   15   15   18   11   25   21   19   34     x2  High
Read MB/s      3    5    9   13   24   35   46   55   59   71   72    109  Fast
% CPU Ut      21   23   26   18   21   22   38   33   35   42   47     x2  High

Block KB       1    2    4    8   16   32   64  128  256  512 1024

Run on AMD4, Files on AMD2
Write MB/s     3    4    5    7    8   13   13   14   14   15   17     28  Slow
% CPU Ut       7    7    7    9    6    7    6    5    5    6    5     x4
Read MB/s      2    5    8   13   26   39   39   45   44   47   45     42
% CPU Ut      10    9    7    3    6   10    9   12    8   14   13     x4

Run on AMD2, Files on AMD4
Write MB/s     2    4    8   10   16   20   22   26   27   26   27     68
% CPU Ut      14   17   15   10    9   10    7    8    9    7    4     x2
Read MB/s      3    5    9   15   26   40   36   44   51   53   52    109
% CPU Ut      14   10   12   13   17   16   13   23   14   19   17     x2

Block KB       1    2    4    8   16   32   64  128  256  512 1024

Run on AMD2, Files on C2D
Write MB/s     2    4    8   13   15   18   19   23   24   22   24     41
% CPU Ut      16   18   16   17   10   10    7   10    7    5    5     x2
Read MB/s      5    6   10   20   39   52   52   58   62   64   64     53
% CPU Ut      22   17   15   12   20   22   11   12   19   21   18     x2

Run on C2D, Files on AMD2
Write MB/s     3    4    5    7   11   12   13   14   15   17   17     28  Slow
% CPU Ut      21   18   13   12   10    9    8    8   10    8    7     x2
Read MB/s      5   10   10   20   38   45   44   51   54   55   55     42
% CPU Ut      42   54   24   33   24   39   20   28   29   37   29     x2  High
CDDVDSpd Benchmark
CDDVDSpd Benchmark can read files from any source and also write and read a combination of one large and 520 small files. In this case, large/small files used varied from 1MB/2KB to 32MB/64KB. Further details and results can be found in CDDVDSpd Results.htm. The results from a single pass of the large files are not necessarily accurate.
Below are example MB/second speeds with large files and milliseconds per file on the small ones. Using 100 Mbps LAN, maximum large file speeds were less than 11 MB/second between all systems.
As shown later, under Performance Monitoring, faster reading speeds are produced where the target system reads data from RAM based File Cache, in spite of the program requesting that this should not happen. Other variations can be caused as reading can start before all data is written to disk. This might explain why C2D to AMD4 reading speeds vary so much.
Other issues can be unclear, like consistently slower results for a time. Another problem is that, fairly regularly with AMD4 to C2D and occasionally with C2D to AMD4, the benchmark stops, indicating that one of the small files cannot be found. The stop message box allows the disks to be examined and, when this was done, the files were all there. These errors occur where the data is actually read from disk, so perhaps the do not cache option should not be used when so many small files are written.
Later tests, with a version that enables caching, did not suffer from these failures.
Of particular note, speed on writing large files, at the lower end of the range shown, is generally slower than results available for disks connected by USB - see USB Results. This appears to be due to a large overhead for opening a single file for writing. Minimum time to write and read small files can also be slower, this being based on the average time for 520 files.
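To compare the small file timings with the large file speeds, the average milliseconds per file can be converted to an effective data rate. The helper below is an illustrative sketch with a hypothetical name. It assumes 1 KB is 1024 bytes and that MB/second is decimal, which the text does not state for these tests.

```python
def small_file_mb_per_sec(file_kb, msecs_per_file):
    """Effective MB/second (decimal) from average milliseconds per file."""
    bytes_per_file = file_kb * 1024
    seconds_per_file = msecs_per_file / 1000.0
    return bytes_per_file / seconds_per_file / 1e6


# Example: 64 KB files averaging 4.3 milliseconds each.
print(round(small_file_mb_per_sec(64, 4.3), 2))
```

On these assumptions, 64 KB files at 4.3 milliseconds each correspond to about 15 MB/second, while 2 KB files at 2.6 milliseconds correspond to less than 1 MB/second.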
1000 Mbps LAN

           AMD4 To C2D   C2D To AMD4  AMD4 To AMD2  AMD2 To AMD4   AMD2 To C2D   C2D To AMD2
Large Files
      MB  Write   Read  Write   Read  Write   Read  Write   Read  Write   Read  Write   Read
           MB/s   MB/s   MB/s   MB/s   MB/s   MB/s   MB/s   MB/s   MB/s   MB/s   MB/s   MB/s
       1    8.5   24.1   15.9   50.7    7.8   35.9   10.6   47.9    9.0   53.4    6.8   48.0
       2   10.2   23.5   20.2   57.6    9.7   39.9   11.5   46.6   11.2   57.6    9.9   50.6
       4   18.5   25.9   32.7   78.3   13.3   38.4   18.8   53.3   14.5   60.1   12.6   52.0
       8   24.4   34.1   37.0   27.8   15.5   40.9   23.4   55.6   17.4   61.8   16.3   50.8
      16   29.8   38.9   43.4   49.3   16.7   41.7   27.1   53.5   19.7   62.3   18.1   50.5
      32   36.1   41.8   50.8   41.7   21.9   42.9   24.0   51.6   22.4   61.9   18.7   53.1
DiskGraf
     Max     35     46     60     72     17     45     27     52     24     64     17     55

Small Files
      KB  Write   Read  Write   Read  Write   Read  Write   Read  Write   Read  Write   Read
          msecs  msecs  msecs  msecs  msecs  msecs  msecs  msecs  msecs  msecs  msecs  msecs
       2    2.6    2.4    4.6    3.4    5.6    3.0    6.3    5.1    3.7    3.8    3.8    2.9
       4    2.8    2.2    4.6    3.5    4.1    2.8    5.7    4.6    3.9    3.7    3.8    3.1
       8    2.8    3.4    4.3    3.3    5.1    2.9    5.9    5.5    3.8    3.7    4.0    3.1
      16    3.0    3.4    4.6    3.6    4.5    3.1    6.1    4.8    4.0    3.7    4.2    3.2
      32    3.2    3.2    4.8    4.0    5.4    3.3    7.1    6.1    4.4    3.8    4.6    3.5
      64    4.3    5.2    5.7    5.1    8.7    4.3    6.8    6.3    4.8    4.6    6.0    4.1
DiskGraf
   Block
       2    0.4    0.4    0.4    0.4    0.4    0.4    0.4    0.3    0.4    0.2    0.4    0.2
      64    2.0    1.3    1.8    1.4    5.1    1.6    2.9    1.8    3.3    1.2    4.8    1.5

Delete
 Seconds    0.8           1.6           1.0           2.7           1.7           1.7
Setting Up Performance Monitor
Perfmon is started via Start, Run (or Start Search with Vista), typing Perfmon and pressing Enter. Vista requires administrative permission. The program can also be started via Control Panel, Administrative Tools, then Performance (XP) or Reliability and Performance (Vista).
XP - Select Performance Logs and Alerts, Counter Logs. Via menu Action select New Log Settings, type a name and OK, then Add Counters.
Vista - Select Data Collection Sets, User Defined. Via menu Action select New Data Collection Set, then Create Manually, Next, tick Performance Counter, Next, Add.
Counters - Under Performance Object or Counter, select and Add: Processor - % Processor Time; PhysicalDisk - Disk Read Bytes/sec, Disk Reads/sec, Disk Write Bytes/sec, Disk Writes/sec; Memory - Page Reads/sec, Page Writes/sec. Then Close or OK. For LAN/network measurements, add counters from Network Interface - Bytes Received/sec, Bytes Sent/sec, Packets Received/sec, Packets Sent/sec and possibly Output Queue Length, Discard and Error counters.
For other settings see Perfmon Help. Those used or changed were sample interval 1 second, log destination, log type Text Comma Delimited (CSV for spreadsheet), manual start/stop and, with Vista, 10000 samples.
Performance Monitor - DiskGraf
Performance Monitor was enabled to investigate inconsistent behaviour between systems. Below are results for AMD2/XP x64 and C2D/Vista 64 running DiskGraf with files on AMD4/Windows 7/64. The particular test uses sixteen 32 MByte files, or 512 MB (537 MB decimal). In this case, the CPU utilisation shown is the overhead on AMD4; that for executing the programs is provided in DiskGraf logs - an AMD2 average of 4.3% writing and 16.7% reading, with the faster C2D using 28.6% and 37.6%.
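The two sizes quoted for the test data differ only in the definition of a megabyte. The conversion is shown below as a small sketch, with a hypothetical function name, since the same distinction applies when comparing file sizes with the decimal MByte totals from Perfmon.

```python
def binary_mb_to_decimal_mb(binary_mb):
    """Convert megabytes of 1,048,576 bytes to megabytes of 1,000,000 bytes."""
    return binary_mb * 1048576 / 1e6


# Sixteen 32 MByte files.
print(round(binary_mb_to_decimal_mb(16 * 32)))
```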
AMD2 is clearly much slower on transmitting data to the same system but, surprisingly, with much lower CPU utilisation. Perhaps C2D/Vista is hanging around waiting for acknowledgement of previous packets and can therefore respond faster for the following data.
On reading, AMD2 speed is similar but slightly faster on average. The main feature is that, with AMD2 running the benchmark, data is not read from the AMD4 disk, although this is required and requested by the benchmark program. In this case, speed is limited by LAN performance but, with a slower disk, reading data transfer rate could be faster than the disk’s capability.
Monitoring AMD4 - Same Destination, Different Source PCs

        Run on Athlon 64, Files on AMD4                Run on C2D, Files on AMD4

Write 16 x 32 MB Files
       Mbytes  Mbytes  Mbytes  Mbytes   % CPU     Mbytes  Mbytes  Mbytes  Mbytes   % CPU
 Secs   Rec/s  Sent/s  Read/s Write/s    Util      Rec/s  Sent/s  Read/s Write/s    Util
    1    14.7     0.1     0.0     0.1       0       39.2     0.2     0.0    37.5       0
    2    33.3     0.2     0.0    33.7       9       63.0     0.3     0.0    60.9      15
    3    39.7     0.2     0.0    33.6      14       59.8     0.3     0.0    57.7      15
    4    42.8     0.2     0.0    33.7      16       42.8     0.2     0.0    41.5      11
    5    42.7     0.2     0.0    33.1      20       53.1     0.3     0.0    51.6      11
    6    11.5     0.1     0.0    33.7      15       47.0     0.2     0.0    45.1      11
    7    40.2     0.2     0.0    33.7       9       57.8     0.3     0.0    55.0      14
    8    18.7     0.1     0.0    21.1      14       57.9     0.3     0.0    57.1      17
    9    30.2     0.2     0.0    12.7       6       57.8     0.3     0.0    56.1      18
   10    30.7     0.2     0.0    33.7       8       52.4     0.2     0.0    50.8      14
   11    38.2     0.2     0.0    33.7      14       27.3     0.2     0.0    26.5      14
   12    28.6     0.2     0.0    33.7      14
   13    42.6     0.2     0.0    33.6      15
   14    32.9     0.2     0.0    33.2      14
   15    40.4     0.2     0.0    33.6      15
   16    33.3     0.2     0.0    33.6      13
   17    36.2     0.2     0.0    33.6      15
Total   556.8     3.1     0.0   504.2     212      558.2     2.7     0.1   539.9     141

Read 16 x 32 MB Files
       Mbytes  Mbytes  Mbytes  Mbytes   % CPU     Mbytes  Mbytes  Mbytes  Mbytes   % CPU
 Secs   Rec/s  Sent/s  Read/s Write/s    Util      Rec/s  Sent/s  Read/s Write/s    Util
    1     0.6    21.7     0.0     0.0       0        0.3    44.9    44.6     0.0       0
    2     0.7    61.9     0.0     0.0       9        0.4    52.7    50.4     0.0       6
    3     0.6    57.4     0.0     0.0      11        0.4    48.8    50.7     0.0       4
    4     0.7    61.3     0.0     0.0      12        0.4    51.7    51.5     0.0      11
    5     0.6    55.7     0.0     0.0      13        0.4    52.3    52.1     0.0       9
    6     0.6    54.3     0.0     0.0       9        0.5    61.0    60.8     0.0       5
    7     0.6    58.7     0.0     0.0      10        0.4    51.7    51.5     0.0       7
    8     0.7    61.8     0.0     0.0       8        0.4    52.7    52.5     0.0       8
    9     0.6    57.4     0.0     0.2      13        0.4    51.1    51.0     0.0       7
   10     4.0    48.4     0.0     0.0      10        1.3    50.1    49.9     0.5      14
Total     9.6   538.5     0.0     0.3      96        4.7   517.0   514.9     0.6      72
For the above and other examples, with 1 second sampling, Bytes/Second have been converted to MBytes. % CPU is for two or four CPUs (four in this case).
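The conversion of logged Bytes/Second samples to MBytes can be sketched as below for a Perfmon CSV log. This is an illustration rather than the method actually used (a spreadsheet would do the same job). It assumes 1 second sampling, so each sample approximates the bytes moved in that second, and that the counter name is the text after the last backslash in each column heading. The function name and the sample headings in the test are hypothetical.

```python
import csv
import io

WANTED = ("Bytes Received/sec", "Bytes Sent/sec",
          "Disk Read Bytes/sec", "Disk Write Bytes/sec")


def totals_in_mbytes(csv_text, wanted=WANTED):
    """Sum the per-second byte counters of a Perfmon CSV log in decimal MBytes.

    Columns for several instances of one counter (for example two disks)
    are added together. Blank or non-numeric samples are skipped.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = {}
    for index, name in enumerate(header):
        counter = name.rsplit("\\", 1)[-1]
        if counter in wanted:
            columns[index] = counter
    totals = {counter: 0.0 for counter in columns.values()}
    for row in reader:
        for index, counter in columns.items():
            try:
                totals[counter] += float(row[index]) / 1e6
            except (ValueError, IndexError):
                continue
    return totals
```

Typical use would be totals_in_mbytes(open("log.csv").read()), where log.csv is whatever file name was chosen as the log destination.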
Performance Monitor - CDDVDSpd Disk
Running CDDVDSpd with Performance Monitor enabled shows that unexpected fast performance on reading can be due to not reading from disk but transferring data from File Cache, as with DiskGraf. As shown below, the disk file might still be being written as the data is read, without slowing it down much.
Later results below are for two sets of tests between all three systems, showing monitoring statistics and measured performance. All show similar counts of LAN bytes received and sent for 67.7 MB (decimal) of data plus acknowledgement bytes. Only AMD4 to C2D and C2D to AMD4, between Windows 7 and Vista, appear to read data from disk, and the amount written is larger. The others, where the do not cache command has been ignored, might not have finished writing before the files were deleted.
The lowest amounts of CPU time used on the measured target systems are where the data is sent from AMD4/Windows 7 and the highest with AMD2/XP sending.
Further tests for writing 520 files were run with special versions of the benchmark, one not using FILE_FLAG_NO_BUFFERING to enable File Cache. These versions have pauses to identify later disk activity. The programs were run on AMD4 to write files on C2D and AMD2 and results are below.
With AMD2 destination, speeds and data volumes of the two tests were effectively the same, using the remote File Cache in both cases. C2D did not cache the data when told not to but appeared to write more data. C2D cached speeds were the fastest but with the lowest volume of data saved during the timed period and two CPUs being used for some of the time (CPU seconds > measured time).
Run on Athlon 64, Files on AMD4

AMD4
       Mbytes  Mbytes  Mbytes  Mbytes   % CPU   Up to
 Secs   Rec/s  Sent/s  Read/s Write/s    Util    Secs
    1     9.2     0.0     0.0     0.1       5
    2    25.7     0.1     0.0    22.3      12    1.98  Write 32 MB at 16.12 MB/sec
    3     3.4     0.1     0.4    13.1      14
    4    14.4     0.3     0.0     3.8      20
    5    16.9     0.4     0.0     3.6      27    4.63  Write 520 x 64 KB at 12.28 MB/sec
    6     1.8    33.5     0.1     8.9      12    5.19  Read 32 MB at 57.1 MB/sec
    7     0.5    15.9     0.0     1.1       9
    8     0.5    13.6     0.0     3.9      17    7.73  Read 520 x 64 KB at 12.79 MB/sec
    9     0.4     5.9     0.0     4.4      17
   10     0.4     0.4     0.0     2.5      17    9.66  Delete 521 files for 1.93 secs
Total    73.2    70.4     0.5    63.7     150
CDDVDSpd 32 MB Large File, 64 KB Small Files - Both Total 64 MB

Util     Recv   Send  Write   Read    CPU    Large MB/s    Small MB/s    Del
Log        MB     MB     MB     MB   Secs  Write   Read  Write   Read   Secs

AMD4 To C2D
C2D      73.4   74.5   74.5   68.8    3.2   34.8   40.0   14.2   10.8    0.9
C2D      73.4   74.5   74.3   68.9    2.8   35.1   37.3   12.8   11.1    1.0
C2D To AMD4
AMD4     73.9   71.1   75.4   69.3    4.0   46.2   31.9   11.7   12.2    2.0
AMD4     74.2   70.9   74.7   69.1    4.4   57.9   26.8   10.1   11.4    1.4
AMD4 To AMD2   C C
AMD2     71.5   69.9   55.8    0.0    3.4   20.7   57.7    8.1   15.4    1.1
AMD2     71.2   69.9   59.8    0.0    3.7   18.3   60.0    8.2   14.8    1.2
AMD2 To AMD4   C C
AMD4     73.3   71.1   66.0    0.6    6.2   32.4   61.0   10.4   11.2    2.9
AMD4     73.1   71.1   61.5    0.1    5.7   32.1   71.9   10.7   11.4    2.0
AMD2 To C2D    C C
C2D      73.8   71.9   58.6    0.2    5.4   25.0   75.0   15.7   15.2    1.4
C2D      73.4   71.5   57.3    0.0    4.6   24.9   74.4   17.3   15.2    1.3
C2D To AMD2    C C
AMD2     72.6   70.5   55.0    0.0    4.1   19.6   75.4   12.1   20.8    1.3
AMD2     72.6   70.5   55.9    0.0    4.5   19.2   75.9   11.5   20.5    1.3

C  No disk reading, data from File Cache
Writing 520 Files With and Without FILE_FLAG_NO_BUFFERING

                           Normal                                Cached
 Each   All                      Disk Total                          Disk Total
 File Files   Write    MB    MB    MB    MB   CPU   Write    MB    MB    MB    MB   CPU
   KB    MB    Secs   Rec  Sent Write Later  Secs    Secs   Rec  Sent Write Later  Secs

AMD2
  262   136     8.7   142   1.5    57   141   3.5     6.5   142   2.3    64   138   3.7
  524   273    10.4   283   2.5   109   277   6.4    10.2   284   3.3   126   275   6.1
 1049   545    15.6   557   3.4   287   557  10.7    15.4   567   5.6   262   549  11.5
C2D
  262   136     4.4   144   2.7   153  same   1.8     2.5   144   2.7    36   140   2.8
  524   273     6.9   288   4.8   291  same   3.1     4.0   288   4.6    81   277   4.1
 1049   545    15.1   575   8.9   593  same   5.9     6.7   575   8.5   223   551   7.6
Performance Monitor - Copy/Paste 1 GB Files
Large data volume transfer tests were carried out between the three systems, using copy/paste with 1 GB video files (1.073 GB decimal). There were six tests run via the PC with the source files and six using the destination system (from/to and to/from). Below is a summary of results, preceded by Performance Monitor details on copying from the Phenom based PC (AMD4) to the one with the Core 2 Duo (C2D). Besides a wide variation in measured copy/paste times, the most significant observation is that as little as 50% of the destination file might be saved when Windows indicates that the operation is finished. Also, during the added time, data is not written at full speed (lazy writing).
The system resources used appeared to be too high for some recordings on the Athlon 64 based PC (AMD2), whereby monitoring went unreported for periods - shown as ?.
AMD4 to C2D - Average reading/sending at almost maximum disk speed for first 11 seconds, then at destination disk speed. File 63% written at measured copying time.
See below for other results and summaries.
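The proportion of a file saved when Windows reports completion follows from the Perfmon totals, as in this small sketch. The function name is hypothetical and the figures depend on whether binary or decimal megabytes are used for the file size, so results may differ by a few percent from those quoted.

```python
def percent_written(written_mb, file_mb):
    """Percentage of the file on disk when the copy is reported as finished."""
    return 100.0 * written_mb / file_mb


# Example: 642.1 MB written at completion out of 1073.5 MB read from the source.
print(round(percent_written(642.1, 1073.5), 1))
```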
                          Source AMD4                                     Destination C2D
          Mbytes  Mbytes  K Pack  K Pack  Mbytes    Disk   % CPU    Mbytes    Disk   % CPU
Secs       Rec/s  Sent/s   Rec/s  Sent/s  Read/s Reads/s    Util    Writ/s  Wrts/s    Util
1            0.4    23.2     4.2     1.0    30.3      29       7       0.0      10      15
2            1.4    95.9    17.4     4.1    90.3      86      13      10.0      41      36
3            1.3    90.0    16.3     4.0    90.3      86      21      19.4      22      41
4            1.3    85.9    15.5     2.5    84.8      81      14      27.5      32      40
5            1.3    91.3    16.5     2.8    86.1      82      17      34.8      43      40
6            1.2    83.1    15.1     2.5    88.2      84      14      26.3      30      41
7            1.1    76.3    13.9     2.4    73.5      70      13      43.1      44      38
8            1.1    76.7    13.9     2.3    75.6      72      11      42.5      45      40
9            1.2    83.0    15.1     2.5    84.0      80      15      44.2      45      38
10           1.3    87.4    15.8     2.6    86.1      82      10      44.5      50      44
11           1.0    72.8    13.0     2.3    77.7      74      14      43.1      64      33
12           0.4    29.5     5.2     1.0    27.3      26       4      27.1      70      31
13           0.4    25.9     4.7     1.2    25.2      24      13      40.5     118      14
14           0.3    20.1     3.6     0.9    22.8      22       5      44.7      61      12
15           0.7    44.4     8.0     2.1    44.1      42      10      39.9      66      20
16           0.5    31.7     5.7     1.5    31.5      30       7      36.6      66      13
17           0.5    34.8     6.2     1.6    33.6      32      12      37.4      50      16
18           0.4    26.3     4.7     1.2    22.0      21       4      36.6      63       9
19           0.0     0.0     0.1     0.0     0.0       0       7      43.7      48       2
Total       15.8  1078.5   195.0    38.4  1073.5    1024     213     642.1     968     523
CPU secs                                                     8.6                      10.5

Total After
50 secs           1078.6                  1073.7                    1154.5             588
Copy/Paste 1GB Files - Summary
1A AMD4 to C2D - Third fastest but only 60% of data written in this time.
1B C2D from AMD4 - AMD4 disk reading and data transfer slower at around 40 MB/second, possibly influenced by higher CPU time on C2D. File 82% written after 30 seconds.
2A C2D to AMD4 - Reading and transmitting at expected C2D disk speed of 50 to 60 MB/second but writing speed less than expected on AMD4, mainly varying between 20 and 40 MB/second (lazy writing?). File 56% written after 23 seconds and saving continued slowly for a further 50 seconds.
2B AMD4 from C2D - Copy more than twice as long as program running on C2D but 71% saved in this time. Disk reading/data transfer slow at 20 to 27 MB/second. Disk writing about 40 MB in a second followed by 10 KB in next second.
3A AMD4 to AMD2 - This is the fastest timed copy/paste at 19.3 seconds but the amount written in this time is unknown as AMD2 failed to report most samples. Disk reading and LAN traffic averaged 83 MB/second over first 11 seconds.
3B AMD2 from AMD4 - Probably the best results with balanced reading/transmitting/writing at averages of around 40 MB/second and completing 90% in the timed 24.6 seconds.
4A AMD2 to AMD4 - Sinusoidal like read/send/write 0 to 45 MB/sec and average 23. Slow at 49.1 seconds but 79% written in this time. Some minimum output queuing on AMD2.
4B AMD4 from AMD2 - This time copy from is much faster at 19.6 seconds but only 48% is saved in this time. Again, AMD2 does not record properly but AMD4 bytes received per second is mainly greater than 6M. Disk writing speed is variable and slow for the first 10 seconds - average 18 MB/second.
5A AMD2 to C2D - This is the second slowest at 61.9 seconds but 95% of the writing is completed. Reading from AMD2 is again sinusoidal like at 0 to 50 MB/second with disk writing being less variable. There were again indications of AMD2 output queuing.
5B C2D from AMD2 - Slowest copy at 68.6 seconds and 93% written. AMD2 disk reading speed and data transmission varied up to 37 MB/second but C2D disk writing speed was more constant, mainly 12 to 18 MB/second and further AMD2 output queuing.
6A/B C2D to AMD2 and AMD2 from C2D - these were fairly fast at 24.0 and 22.3 seconds but AMD2 recording failures prevented logging details of how much data was written. C2D disk reading and sending data was at an average of 55 MB/second with C2D CPU utilisation at 100% of one CPU, running copy/paste on C2D, and 66% using AMD2.
Local Disk to Disk - Note copy/paste for 1 GB files on the fastest local disk (AMD4) took typically 23 seconds using 11.5 CPU seconds. Combined reading/writing speed was around 64 MB/second but only about 50% of the data was saved in this time, the remainder being lazily written over a further 30 seconds - 17 MB/second.
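The lazy writing rate quoted for the local copy can be checked with simple arithmetic. The sketch below uses a hypothetical function name and takes the file size as 1073.7 MB (decimal), with half still to be written over a further 30 seconds.

```python
def lazy_write_rate(file_mb, fraction_saved, extra_secs):
    """Average MB/second needed to save the rest of the file after completion."""
    remaining_mb = file_mb * (1.0 - fraction_saved)
    return remaining_mb / extra_secs


print(round(lazy_write_rate(1073.7, 0.5, 30), 1))
```

This gives a little under 18 MB/second, in the same region as the 17 MB/second quoted, the difference being down to the approximate 50% and 30 second figures.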
                      A Copy From/To                 B Copy To/From
                       Source   Dest                  Source   Dest
   File   File   Clock    CPU    CPU     MB     Clock    CPU    CPU     MB
   Source Dest    Time   Secs   Secs  Write      Time   Secs   Secs  Write
1  AMD4   C2D     19.9    8.6   10.5    642      29.8   10.8   21.0    840
2  C2D    AMD4    22.7   19.5    9.9    577      47.3   27.4   17.3    731
3  AMD4   AMD2    19.3   12.0      ?      ?      24.6   11.1   11.2    918
4  AMD2   AMD4    49.1    9.3   14.3    807      19.6      ?   11.6    493
5  AMD2   C2D     61.9    9.8   15.9    972      68.6    7.9   20.4    952
6  C2D    AMD2    24.0   23.5      ?      ?      22.3   12.9      ?      ?
Copy/Paste Many Files
These tests use a folder containing 857 files with total size of around 52 MBytes (decimal) - 60 KB average file size.
To demonstrate consistency, two sets of results are provided for twelve copy to and copy from tests between the three systems. Copying times were between 14 and 45 seconds, a major surprise being that these could be slower than measured from a disk connected via USB (see USB results). CDDVDSpd benchmark results above also demonstrate that USB connected disks can be faster.
As with the other monitoring measurements, approximately the same number of bytes sent and received were recorded on the source and destination PCs. AMD4/Win7 and C2D/Vista incur writing of a large amount of additional data, over up to a minute or more, after timing was finished. Then using the same two systems as source with “copy to” leads to extra data being received. Respectively, these two anomalies were later identified as due to indexing for search purposes and unacceptable behaviour introduced by AVAST anti-virus software.
There is little sign of performance being limited by the CPU time used, but the variations are surprising.
The data used for these tests includes 400 tiny GIF files, each 71 Bytes but using a total of 1.9 MB disk space. Later results shown are averages of three runs copying just these GIF files and the data without them. These were repetitions, where the data would be cached in RAM on the source computer.
Packets sent and received are also shown, where calculations indicate that average packet size is around 250 bytes for the tiny files and average more than 1 KB for the others.
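The average packet size calculation is simply bytes divided by packets. A minimal sketch follows, with a hypothetical function name, taking megabytes as decimal and packet counts in thousands, as logged. The example figures are the rounded MB and K packet values sent by AMD4 to C2D in the later results, so the output is only approximate.

```python
def average_packet_bytes(mbytes, k_packets):
    """Average bytes per packet from decimal MBytes and thousands of packets."""
    return mbytes * 1e6 / (k_packets * 1e3)


# Tiny files: 1.7 MB in 6.2 K packets. Other files: 58.7 MB in 54.1 K packets.
print(round(average_packet_bytes(1.7, 6.2)))
print(round(average_packet_bytes(58.7, 54.1)))
```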
In terms of MB/second, the larger files were transferred twelve times faster.
The last example is when using multiple copies of the larger files with normal copying followed by one where the data is already in the source PC’s File Cache. Results are averages of three tests that were all quite similar, anyway. It can be seen that the volume of data written to disk is quite low, over the copying time indicated by Windows, where the average data transmission sending speed is still a disappointing 12 to 13 MB/second.
The high number of bytes received is mainly associated with 14 MB of PDF and HTML files, the AVAST effect.
File   File        Clock     MB     MB     MB    CPU       MB  Total    CPU
Source Dest         Secs    Rec   Sent   Read   Secs    Write  Later   Secs

AMD4   C2D  To      14.1   45.8   63.4   46.7    7.9     49.5   75.5    4.6
AMD4   C2D  To      14.4   47.6   64.3   58.5    8.6     54.8   80.7    6.4
AMD4   C2D  From    16.8    4.2   72.0   53.0    7.2     49.4   72.3   11.0
AMD4   C2D  From    17.5    4.2   72.0   54.2    4.9     50.0   73.9    8.9

C2D    AMD4 To      17.3   39.1   59.1   52.5    9.6     51.6  111.7    8.3
C2D    AMD4 To      18.2   39.2   59.2   52.3    9.9     62.5  114.6   11.4
C2D    AMD4 From    13.7    3.5   69.7   52.5    3.0     39.3  111.9    9.2
C2D    AMD4 From    14.2    3.6   69.8   61.0    3.5     34.2   95.0    9.4

AMD4   AMD2 To      37.1   38.2   55.7   43.2   13.6     58.5   60.9    6.1
AMD4   AMD2 To      37.9   38.3   55.9   56.6   19.1     58.2   60.0    6.4
AMD4   AMD2 From    42.3    4.6   56.1   53.5    9.9     53.9   61.1    9.0
AMD4   AMD2 From    42.6    4.6   56.0   52.8    9.2     53.0   61.1    9.3

AMD2   AMD4 To      44.9    5.8   57.9   68.0    8.9     83.8  110.1   20.5
AMD2   AMD4 To      45.2    5.9   58.1   61.7    9.9    129.4  152.3   27.9
AMD2   AMD4 From    38.8    3.8   67.0   62.9    3.7     89.3  113.6   23.4
AMD2   AMD4 From    35.0    3.8   67.1   62.2    3.7     82.0  105.0   20.1

C2D    AMD2 To      31.1   39.1   56.6   52.5   12.5     56.6   59.8    7.7
C2D    AMD2 To      35.2   39.2   56.7   52.5   14.2     57.7   60.4    7.9
C2D    AMD2 From    30.0    4.7   56.1   52.4    5.6     46.3   61.6    7.6
C2D    AMD2 From    31.7    4.7   56.1   52.4    5.4     48.3   61.8    7.4

AMD2   C2D  To      33.2    6.3   58.2   61.7    9.8     86.4  100.2   16.8
AMD2   C2D  To      38.1    6.3   58.2   63.0    9.6     85.5  138.3   16.0
AMD2   C2D  From    31.7    4.1   71.7   62.7    4.7     69.2   86.1   16.0
AMD2   C2D  From    34.8    4.1   71.8   61.8    5.2     73.4  101.9   15.3
                       Source                                     Dest
                                    MB         K Packets          MB  Total    CPU
                          Secs    Rec   Sent    Rec   Sent     Write  Later   Secs

400 x 71 Byte Files Cached (no disk reading) - Files Minimum 1.64 MB on disk
AMD4   C2D  To             4.0    2.2    1.7    6.6    6.2       2.6   27.5    1.1
AMD4   AMD2 To             9.5    2.0    1.7    8.9    9.1       2.8    2.8    1.0

Other Files Cached - now, 51.5 MB, 428 files, average size 120 KB
AMD4   C2D  To            10.3   39.1   58.7   48.4   54.1      44.6   81.8    3.9
AMD4   AMD2 To            22.6   35.4   54.4   21.0   49.0      42.1   57.4    2.7

Larger Files Uncached and Cached - 51.3 MB, 247 files, average size 208 KB
AMD4   C2D  52 MB Read     6.0   26.0   52.7   29.6   41.1      19.6   66.4    1.5
AMD4   C2D  0 MB Read      4.3   25.9   52.6   29.2   40.9      13.8   72.6    1.5
Small Files and Packet Size
As indicated above, speed is very slow on copying small files and the packet size used might not be as expected. Further tests were run to try to identify the cause. These were all writing from AMD4/Win 7 to C2D/Vista, from repeated tests where data is read from RAM based File Cache rather than from disk.
Copy/Paste - The 400 tiny files were resized to between 0.8 KB and 5.4 KB each, in different folders (using Photoshop Elements 2.0 Batch Processing).
Reported results are averages of three separate tests. The 4.2 seconds copying time at 0.8 KB was not much slower than using 71 bytes (above). The time did not increase much with 1.4 and 2.8 KB files at 4.3 to 4.5 seconds. For these, Perfmon shows that data is being transferred between 2 and 3 seconds when CPU utilisation of one processor is 60% to 70%.
Also, an average of 17 packets per file appear to be sent, irrespective of overall megabytes but with some increase in apparent bytes per packet.
Online Tests - The original 400 GIF files are provided for timing how fast they can be loaded into a web browser. The tests can be run from OnLine benchmarks.htm or locally from files in OnLineTests.zip.
Modified HTML files were produced to load the resized files used for the Copy/Paste test and Internet Explorer 8 was used to load them.
The load/display time was measured with a stopwatch and is faster than that above as there is no overhead introduced by the copy/paste progress monitor. Data and packet volumes are somewhat similar.
CDDVDSpd Benchmark - A modified version of this benchmark was used to provide a wider range of results without the need of a stopwatch. This program produces a message box after each write/read phase to avoid overlapped activity and a pause in Perfmon recording. It also creates files without FILE_FLAG_NO_BUFFERING, so that the data is cached in RAM at the destination PC, as applicable to copy/paste and browser use. The figures below are for writing 520 files.
Number of Packets and Size - Based on previous (old) experience, it was anticipated the maximum packet size for sending data would be around 1500 bytes, minimum 64 bytes, and, for each, there would be an acknowledgement packet (ACK) of about 64 bytes. Now it seems that ACKs can be returned for multiple packets. The number is said to be up to 13 for gigabit Ethernet, via TcpAckFrequency Registry settings, but no entries could be found on the two systems used. For the results below, up to four times more packets were received than sent on system C2D, where the size of data packets is identified as up to nearly 1500 bytes. Corresponding size on the Windows 7 PC was up to over 20 thousand bytes with far fewer packets for the same total data volumes.
LAN Extra Data - There appears to be an overhead of 0.7 MBytes or more on total data bytes sent by AMD4 (for 520 files) and 0.6 MB or greater for data received. These, and calculated packet sizes, suggest that C2D is sending some data besides ACKs.
Disk Less and Extra Data Written - For the larger files, according to Windows based timing, as low as 35% of the data might be saved to disk when sending is said to be finished, and data will be written, possibly more slowly (lazy writing) for a further 20 seconds. At the same time and/or later, additional disk writing is recorded. This amounts to around an extra 27 MB for 520 files of any size, typically within 40 seconds after LAN data transmission is finished. Perfmon, Resource Overview, Disk Activity can identify the reason for the extra data (Vista sorted by Write Bytes/Minute using Stop and Start recording). This shows that most of the extra data is due to SearchIndexer and NTFS Volume Log. Unlike running the benchmark locally, some of the data files (3 MB on 1049 KB test) are recorded as up to 1570 KB being written (but shown as 1049 KB on Disk Properties), suggesting that part of the files might be written twice.
Data Transmission Speed - With the data cached in RAM on the destination PC, disk writing might not interfere with transmission speed. The speed, in MB/second, is shown in the table for the Windows timed part. It varies from 1.4 MB/second, for the smallest files, up to 78.9 MB/second for the largest.
Copy/Paste 400 Files Write Only, AMD4/Windows 7 Recordings
             File  Files Source       MB         K Packets   Packet Bytes
               KB     MB   Secs    Rec   Sent    Rec   Sent    Rec   Sent
AMD4 > C2D    0.8    0.3    4.2    1.5    1.9    6.8    6.8    220    282
AMD4 > C2D    1.4    0.6    4.3    1.5    2.2    7.2    6.8    211    319
AMD4 > C2D    2.8    1.1    4.5    1.5    2.7    7.2    6.8    211    403
AMD4 > C2D    5.4    2.2    5.3    2.3    5.1   11.1   10.0    206    508

HTML OnLine Test 400 Files, AMD4/Windows 7 Recordings
             File  Files Source       MB         K Packets   Packet Bytes
               KB     MB   Secs    Rec   Sent    Rec   Sent    Rec   Sent
AMD4 > C2D    0.8    0.3    3.0    1.8    2.0    8.2    7.9    214    252
AMD4 > C2D    1.4    0.6    3.0    1.8    2.1    8.2    7.9    215    272
AMD4 > C2D    2.8    1.1    3.3    1.7    2.5    8.2    7.8    214    320
AMD4 > C2D    5.4    2.2    3.2    1.9    2.7    9.1    8.7    210    307
CDDVDSpd Write 520 Files - Program on AMD4/Windows 7, files on C2D/Vista
                                     Win 7                                  Vista
   Each    All                         MB/ Packet   Size   AMD4 Packet   Disk  Total           C2D
   File  Files  Write     MB     MB    sec  Bytes  Bytes    CPU   Size     MB     MB     MB    CPU
     KB     MB   Secs    Rec   Sent   Sent    Rec   Sent   Secs  Rec B  Write  Later  Extra   Secs
    2.0    1.1    1.3    0.9    1.8    1.4    229    569    1.1    499    3.6     27     26    0.8
    4.1    2.1    1.2    0.8    2.7    2.2    229    905    1.1    701    3.7     28     26    1.0
    8.2    4.3    1.0    0.9    5.0    5.1    219   1577    1.4    900    4.1     29     25    1.0
     16    8.5    1.3    0.9    9.3    7.3    192   2920    1.0   1089    4.9     36     27    0.9
     33     17    1.5    1.0     18   11.9    159   5615    1.4   1275    8.1     46     29    1.2
     66     34    1.7    1.2     35   20.5    131   9517    1.1   1377    5.5     65     31    0.9
    131     68    2.1    1.7     69   33.5    110  12084    1.5   1422     14     98     30    1.7
    262    136    2.9    2.9    138   47.6    102  16821    2.1   1470     35    167     31    2.6
    524    273    4.3    4.9    274   63.3     93  23114    3.0   1494    101    300     27    4.9
   1049    545    7.0    8.9    549   78.9     88  27565    3.3   1503    190    572     27    8.0
Notes
File sizes and disk MB figures are decimal.
MB Rec/Sent on AMD4 are the same as C2D Sent/Rec.
Packets/bytes received by AMD4 are the same as C2D sent.
Packet Bytes Sent (AMD4) compared with Packet Size Rec (Vista): the number of packets sent by AMD4 is far fewer, hence the larger size.
Small Files eSATA, LAN, USB
A version of the standard CDDVDSpd benchmark, with caching disabled and pauses between test functions, was run accessing a disk drive capable of running at greater than 100 MB/second. The drive is in an external enclosure connected either by eSATA or USB 2.0 to system C2D. Besides tests run directly via these connections, others were run from AMD4 via the LAN to eSATA. Sample results are given below for writing 520 files of 2 KB to 1024 KB (binary), or 1.1 MB to 545 MB (decimal) overall.
For all tests, the disk properties were set to “Optimise For Performance”.
In this case, all the data (except a little) appears to be written to disk within the timed period. This demonstrates a speed on writing the largest files directly to eSATA of more than 100 MB/second, where there is an overhead of only 3 MB, mainly for NTFS Volume Log. On the other hand, the overhead via LAN is up to 48 MB, with more than 6 MB shown as NTFS Volume Log and the remainder indicated as writing more than 1049 KB (1024 KB binary) to many of the files. For real data, maximum data transfer speed via LAN is reduced to less than 40 MB/second, with CPU time being somewhat higher and maximum block size probably restricted to 64 KB.
The main observation of these tests is that data transfer speed via USB is faster than LAN on writing small files of up to 64 KB each. Estimates of LAN speed are an overhead of 2.4 milliseconds per file and transfer speed of 44 MB/second, compared with USB 1.2 milliseconds and 29 MB/second.
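The estimates above imply a simple model of a fixed overhead per file plus size divided by transfer rate. The sketch below applies that model to find the file size at which LAN and USB would take the same time. The model itself is an assumption made for illustration, not the method used to derive the estimates, and the function names are hypothetical.

```python
def transfer_seconds(n_files, file_bytes, overhead_ms, mb_per_sec):
    """Estimated time: fixed per-file overhead plus size / rate (decimal MB)."""
    per_file = overhead_ms / 1000.0 + file_bytes / (mb_per_sec * 1_000_000)
    return n_files * per_file


def break_even_bytes(overhead_a_ms, rate_a, overhead_b_ms, rate_b):
    """File size at which links A and B take the same time per file.

    Solves overhead_a + s / rate_a == overhead_b + s / rate_b for s.
    """
    if rate_a == rate_b:
        raise ValueError("equal rates: the per-file times never cross")
    delta_overhead = (overhead_a_ms - overhead_b_ms) / 1000.0
    delta_per_byte = 1.0 / (rate_b * 1_000_000) - 1.0 / (rate_a * 1_000_000)
    return delta_overhead / delta_per_byte


if __name__ == "__main__":
    lan = (2.4, 44)  # ms per file, MB/second
    usb = (1.2, 29)
    print(round(break_even_bytes(*lan, *usb)))
    print(transfer_seconds(520, 2048, *usb) < transfer_seconds(520, 2048, *lan))
```

Under these estimates the break-even point is roughly 100 KB per file, which is consistent with USB being faster up to the 64 KB test size.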
HTML Folders - Further tests were run copying a folder where HTML files are saved. This occupies 19.1 MB (decimal) with 3683 files, many being tiny GIF images, giving average size of 5.2 KB.
For these tests, copying directly via eSATA and USB again produced similar performance, around twice as fast as transferring data over the LAN. All write about the same volume of data, most within the timed period, but around 37 MB more than the original 19.1 MB. Perfmon Disk Resource Overview shows that the extra data is mainly for NTFS Volume Log.
The copy function was executed on AMD4 to send the data to C2D. A second test was run via C2D to copy data from AMD4. This time, copying time was longer. The amount of data sent and received was also different but the existing tools do not appear to be capable of identifying the reasons why so much extra data is involved.
Zipped Data - In order to compress the data into a single file and copy it via a BAT command line file, the 7-Zip package was downloaded and installed. Then 7z.exe and 7z.dll were copied to the same folder as the BAT files, where a sample command for an HTML folder is 7z a -tzip v:\fromLAN2\TH1.zip TempHTML1.
Copying the data was at least four times faster, the bulk of the measured time being CPU activity compressing the data, where 7z appears to use more than one processor at the same time. Compressed files are 10.3 MB (decimal).
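For comparison, the same compress-to-destination step can be sketched with the Python standard library instead of 7-Zip. This is an illustrative alternative, not what was used for the tests: shutil.make_archive compresses on a single processor, so timings would differ from those for 7z. The function name and the folder names in the usage section are placeholders.

```python
import os
import shutil
import tempfile
import zipfile


def zip_folder_to(source_folder, dest_folder, archive_name):
    """Compress source_folder into dest_folder/archive_name.zip and return its path."""
    parent, name = os.path.split(os.path.abspath(source_folder))
    os.makedirs(dest_folder, exist_ok=True)
    base = os.path.join(dest_folder, archive_name)
    # base_dir keeps the folder name as the top level inside the archive
    return shutil.make_archive(base, "zip", root_dir=parent, base_dir=name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "TempHTML1")
        os.makedirs(src)
        with open(os.path.join(src, "index.htm"), "w") as f:
            f.write("<html></html>")
        archive = zip_folder_to(src, os.path.join(work, "fromLAN2"), "TH1")
        with zipfile.ZipFile(archive) as zf:
            print(zf.namelist())
```

Writing one archive directly to the destination replaces thousands of small file creations with a single large transfer, which is where the saving in copying time would be expected to come from.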
LAN
  Files    Wrt    Rec   Sent    Wrt    Blk    CPU
     MB   Secs     MB     MB     MB     KB   Secs
    1.1    1.5    1.8    0.8    4.0      7    0.8
    2.1    1.4    2.9    0.8    5.3     10    0.7
     To
    273    6.9    288    4.8    291     62    3.1
    545   15.1    575    8.9    593     62    5.9

USB, eSATA and Both
              USB          eSATA          Both
  Files    Wrt    CPU    Wrt    CPU    Wrt    Blk
     MB   Secs   Secs   Secs   Secs     MB     KB
    1.1    0.7    0.5    0.6    0.5    4.1      8
    2.1    0.7    0.5    0.4    0.4    5.2     10
     To
    273   10.0    1.8    3.0    0.6    276    502
    545   19.5    2.7    5.3    1.0    548    973
Copying HTML Folder
                           C2D LAN                    C2D External            C2D   AMD4   AMD4
             Wrt    Rec   Sent     K Packets   Read    Wrt    Blk    Wrt    CPU   Read    CPU
            Secs     MB     MB    Rec   Sent     MB     MB   Wrts  Later   Secs     MB   Secs
LAN         26.6   34.8   26.3   76.9   75.8    0.0   52.0   2320   57.2   12.4   20.3   15.1
Local       13.9                               23.2   45.6   2823   56.7   10.7
USB         14.1                               20.7   47.1   3246   56.6   12.3
Copy From   34.8   50.3   13.6   78.4   63.1    0.0   58.1   4158   60.0   26.6   21.2   12.8
Zipped       5.9   12.3    1.4   13.6   10.2    0.0    6.8     98   10.5    1.0   18.7    6.5
Large PDF Files
Unusual results observed earlier on copying PDF files were confirmed. The test folder is 93.4 MB (decimal) with 34 files sized between 100 KB and 8.3 MB. Using C2D/Vista and AMD4/Windows 7 to copy the folder to the other system resulted in more than 230 MB of data being returned.
Executing the copy function on the destination PC (Copy From) did not incur this overhead. Copying from AMD4 and C2D to AMD2/XP also produced the additional data, but copying from AMD2 to AMD4 and C2D did not.
The files were renamed as .NOT instead of .PDF and there was no additional data. Then .DOC, .HTM and .XLS produced the extra data but renaming as .TXT and .ZIP did not.
AVAST - All three PCs have the latest free version of AVAST anti-virus software installed. Initially, it was thought that AVAST was not the cause of the slow performance, as the slowdown did not occur with AMD2 as the source PC. Later, AVAST On-Access Protection was turned off and it was found that the copy running on the source PC was responsible for apparently reading data from the destination system (see results below).
For Local copy/paste on all three PCs, Task Manager shows AVAST reading the 220+ MB and, with Perfmon showing no extra reading from disk, indicates that the extra data was read from file cache. Using Copy From, AVAST only reads data on the destination PC (from cache). Using Copy To on AMD2 source, again produces AVAST reading only on the destination system. With Copy To from AMD4 and C2D sources, besides the extra data being received, the AVAST reading is shown for both source and destination PCs. This indicates that AVAST is reading cached data over the LAN from the destination RAM.
As usual with “Optimise For Performance” settings, not all of the data is written to disk when copying is indicated as being completed. As seen before with image files, it also appears that some of the data can be prefetched to RAM, as demonstrated by only 31 MB being read on AMD4. Note that the PC was rebooted between tests or alternative folders used.
                                  Source                 Destination
File    File      Clock     MB     MB     MB    CPU     MB  Total    CPU
Source  Dest       Secs    Rec   Sent   Read   Secs  Write  Later   Secs
Copy To
AMD4    C2D         7.9    237     96     31    3.6     43    108    5.1
C2D     AMD4        9.9    236    100     84    6.5     32     98    3.4
AMD4    AMD2       10.1    233     96     31    4.6     90    101    1.8
AMD2    AMD4        7.8    0.8     94     94    1.3     22     98    2.6
C2D     AMD2       11.1    234     97     94    6.6     83     97    2.8
AMD2    C2D         9.1    1.6     94     94    1.6     61    110    3.8
Copy To AVAST Off
AMD4    C2D         5.0    1.5     96     93    1.2     39    107    1.2
C2D     AMD4        5.2    0.8     99     93    2.5     14     96    1.4
Copy From
AMD4    C2D         5.1    0.9     95     31    1.2     39    107    3.9
C2D     AMD4        4.4    0.7     98     98    2.2     20     96    2.5
Rename .PDF to .NOT - Copy To
AMD4    C2D         4.1    1.5     94     94    1.7
Roy Longbottom May 2010
The new Internet Home for my PC Benchmarks is via the link
Roy Longbottom's PC Benchmark Collection